Skip to content

Support External Postgres DB #276

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 82 commits into from
Jul 23, 2025
Merged

Support External Postgres DB #276

merged 82 commits into from
Jul 23, 2025

Conversation

ewilliams-cloudera
Copy link
Collaborator

No description provided.

jkwatson and others added 30 commits July 15, 2025 13:09
lastFile:ui/src/pages/Settings/MetadataDBFields.tsx
lastFile:llm-service/app/config.py
lastFile:ui/src/pages/Settings/MetadataDBFields.tsx
@baasitsharief baasitsharief requested a review from Copilot July 22, 2025 21:56
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for external PostgreSQL databases as an alternative to the embedded H2 database for metadata storage. The key feature allows users to configure and validate PostgreSQL connections through the UI while maintaining backward compatibility with H2.

  • Introduces metadata database configuration options in the UI with PostgreSQL connection validation
  • Adds comprehensive validation for metadata database connections alongside existing storage and model validation
  • Updates backend infrastructure to support PostgreSQL metadata databases with proper JDBC handling

Reviewed Changes

Copilot reviewed 37 out of 41 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
ui/src/pages/Settings/MetadataDBFields.tsx New component for configuring metadata database providers with PostgreSQL connection testing
ui/src/pages/Settings/AmpSettingsPage.tsx Integrates metadata database fields and updates validation logic
llm-service/app/services/amp_metadata/init.py Adds PostgreSQL validation logic and metadata database configuration support
backend/src/main/java/com/cloudera/cai/util/db/JdbiUtils.java Adds JDBC connection testing utility with command-line interface
backend/src/main/java/com/cloudera/cai/util/db/RdbConfig.java Updates PostgreSQL URL handling and database name extraction logic
Comments suppressed due to low confidence (2)

ui/src/main.tsx:54

  • [nitpick] The parameter name has changed from _ to failureCount but the first parameter in the retry function signature should typically be named failureCount or similar. However, the original parameter was intentionally named _ to indicate it was unused. Since it's now being used on line 62, the rename is appropriate.
      retry: (failureCount: number, error: Error) => {

ui/src/pages/Settings/MetadataDBFields.tsx:115

  • [nitpick] The error message could be more specific about which fields are missing. Consider: 'Please provide all required fields: JDBC URL, username, and password.'
      messageQueue.error("JDBC URL, username, and password are required for testing connection.");

str(db_type),
]
try:
result = subprocess.run(cmd, capture_output=True, text=True, timeout=15)
Copy link
Preview

Copilot AI Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subprocess call uses user-provided input (db_url, username, password) directly in the command without proper sanitization. While these are passed as separate arguments (safer than shell=True), consider validating input formats to prevent injection attacks.

Copilot uses AI. Check for mistakes.

@@ -153,7 +153,15 @@ public static String buildDatabaseServerConnectionString(RdbConfig rdb) {
}

if (rdb.isPostgres()) {
Copy link
Preview

Copilot AI Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern for PostgreSQL URL parsing is complex and could benefit from a comment explaining its structure and what each group captures for future maintainability.

Suggested change
if (rdb.isPostgres()) {
if (rdb.isPostgres()) {
// Regex pattern to parse PostgreSQL URLs:
// ^jdbc:postgresql:(//[^/]+/)?(\\w+)(.*)
// - (//[^/]+/)?: Matches the optional host and port part of the URL (e.g., //localhost:5432/).
// - (\\w+): Matches the database name (e.g., mydatabase).
// - (.*): Matches any additional parameters or options in the URL (e.g., ?ssl=true).

Copilot uses AI. Check for mistakes.

if not db_url.startswith("jdbc:"):
return ValidationResult(valid=False, message="Invalid JDBC URL format.")

if not username.isalnum() or not (1 <= len(username) <= 16):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why limit the username to 16 characters? Not sure we need this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure if alphanumeric usernames are the right restriction

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RDS has these constraints when me and Liu searched it.

@ewilliams-cloudera ewilliams-cloudera merged commit 3be2e68 into main Jul 23, 2025
3 checks passed
@ewilliams-cloudera ewilliams-cloudera deleted the mob/main branch July 23, 2025 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants